TITLE OF THE INVENTION
IMAGE PROCESSING APPARATUS AND IMAGE PROCESSING METHOD 

FIELD OF THE INVENTION 
The present invention relates to an image
processing apparatus and image processing method which
apply an image process to a plurality of images.

BACKGROUND OF THE INVENTION 

An attempt has been made to sense a real space by
an image sensing apparatus mounted on a mobile object,
and to express the sensed real space as a virtual space
using a computer on the basis of the sensed
photo-realistic image data (see, e.g., Endo, Katayama,
Tamura, Hirose, Watanabe, & Tanikawa: "Method of
Generating Image-Based Cybercities By Using
Vehicle-Mounted Cameras" (IEICE Society, PA-3-4,
pp. 276-277, 1997), or Hirose, Watanabe, Tanikawa, Endo,
Katayama, & Tamura: "Building Image-Based Cybercities
By Using Vehicle-Mounted Cameras (2) -Generation of
Wide-Range Virtual Environment by Using Photo-realistic
Images-" (Proc. of the Virtual Reality Society of Japan,
Vol.2, pp. 67-70, 1997), etc.).

As a method of expressing a sensed real space as
a virtual space on the basis of photo-realistic image
data sensed by an image sensing apparatus mounted on a
mobile object, a method of reconstructing a geometric
model of the real space on the basis of the
photo-realistic image data, and expressing the virtual
space using a conventional CG technique is known.
However, this method has limits in terms of the
accuracy, exactitude, and reality of the model. On the
other hand, an Image-Based Rendering (IBR) technique,
which expresses a virtual space using photo-realistic
images without any reconstruction using a model, has
attracted attention. The IBR technique composes an
image viewed from an arbitrary viewpoint on the basis
of a plurality of photo-realistic images. Since the
IBR technique is based on photo-realistic images, it
can express a realistic virtual space.

In order to create a virtual space that allows
walkthrough using such an IBR technique, an image must
be composed and presented in correspondence with the
position in the virtual space of the user. For this
reason, in such systems, respective frames of
photo-realistic image data and positions in the virtual
space are saved in correspondence with each other, and
a corresponding frame is acquired and reproduced on the
basis of the user's position and visual axis direction
in the virtual space.

As a method of acquiring position data in a real
space, a positioning system using an artificial
satellite such as GPS (Global Positioning System) used
in a car navigation system or the like is generally
used. As a method of determining correspondence
between position data obtained from the GPS or the like
and photo-realistic image data, a method of determining
the correspondence using a time code has been proposed
(Japanese Patent Laid-Open No. 11-168754). With this
method, the correspondence between respective frame
data of photo-realistic image data and position data is
determined by determining the correspondence between
time data contained in the position data, and time
codes appended to the respective frame data of
photo-realistic image data.
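The time-code correspondence described above can be illustrated with a minimal sketch. It assumes that each frame is paired with the position fix nearest in time; the cited publication may use a different rule. The function name, the time unit (seconds), and the tuple layout are hypothetical and chosen only for illustration.

```python
from bisect import bisect_left

def match_frames_to_positions(frame_times, position_fixes):
    """Pair each frame time code with the position fix nearest in time.

    frame_times: frame time codes in seconds (hypothetical format).
    position_fixes: non-empty list of (time_s, latitude, longitude),
        sorted by time.
    Returns one (latitude, longitude) tuple per frame.
    """
    fix_times = [t for t, _, _ in position_fixes]
    matched = []
    for ft in frame_times:
        i = bisect_left(fix_times, ft)
        # Candidates are the fixes just before and just after the frame time.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(position_fixes)]
        best = min(candidates, key=lambda j: abs(fix_times[j] - ft))
        matched.append(position_fixes[best][1:])
    return matched
```

Nearest-neighbor matching is only one option; interpolating between the two surrounding fixes would be a reasonable alternative when the position data are sparse compared with the frame rate.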

The walkthrough process in such virtual space
allows the user to view a desired direction at each
position. For this purpose, images at respective
viewpoint positions may be saved as a panoramic image
that can cover a broader range than the field angle
upon reproduction, a partial image to be reproduced may
be extracted from the panoramic image on the basis of
the user's position and visual axis direction in the
virtual space, and the extracted partial image may be
displayed.
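The extraction of a partial image from a panoramic image can be sketched as follows. This is an illustration under stated assumptions, not the method of any particular system: the panorama is assumed to cover 360 degrees with columns mapped linearly to yaw and column 0 at yaw 0 degrees, and the function names are hypothetical.

```python
def view_columns(pano_width, yaw_deg, fov_deg):
    """Return the panorama column indices covered by a view.

    Assumes a 360-degree panorama whose columns map linearly to yaw,
    with column 0 at yaw 0 degrees (an assumption for this sketch).
    """
    cols_per_deg = pano_width / 360.0
    start = round((yaw_deg - fov_deg / 2.0) * cols_per_deg)
    count = round(fov_deg * cols_per_deg)
    # The modulo wraps the window around the 0/360-degree seam.
    return [(start + k) % pano_width for k in range(count)]

def extract_view(panorama, yaw_deg, fov_deg):
    """Cut the view out of a panorama given as a list of pixel rows."""
    cols = view_columns(len(panorama[0]), yaw_deg, fov_deg)
    return [[row[c] for c in cols] for row in panorama]
```

A real viewer would also resample the extracted strip to remove the panoramic projection; the sketch only shows how the visual axis direction selects the region to display.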

When the image sensing apparatus is shaken, a
panoramic image is also shaken. In such case, when
shakiness of the image sensing apparatus is prevented
by physical means such as a special vibration isolation
device, rail, or the like, the image sensing apparatus
cannot be freely moved, and the image sensing
conditions are restricted. It is impossible in
principle for the method using such physical means to
reduce shakiness of the already sensed video.

When a video image process is used, shakiness of
the already sensed video can be reduced. For example,
when feature points in an image are detected, and are
traced across a plurality of frames, the position and
posture of a camera can be estimated on the basis of a
set of the traced feature points by geometric
calculations such as factorization or the like.
Conventionally, such estimation of the position and
posture of the camera can be implemented using
commercially available match moving software. If the
position and posture of the camera in each frame of the
video can be estimated, shakiness of the video can be
reduced on the basis of the obtained estimated values
of the position and posture of the camera.

The video image process using match moving
software or the like, however, cannot simultaneously
estimate the positions and postures of a plurality of
cameras. Also, the estimated values of the position
and posture of the camera calculated by the video image
process contain errors. For this reason, when
shakiness of images sensed by a plurality of cameras
is reduced by the video image process for each camera,
and the processed images are stitched to form a single
panoramic image, the degree of overlapping of the seams
between neighboring images varies for respective frames.

SUMMARY OF THE INVENTION 
The present invention has been made in
consideration of the aforementioned problems, and has
as its principal object to reduce shakiness of a
panoramic video.

According to an aspect of the present invention,
there is provided an image processing method
comprising: setting a common coordinate system which
can be transformed from individual coordinate systems
of a plurality of image sensing devices; estimating
postures of at least one of the plurality of image
sensing devices; calculating an estimated posture of
the common coordinate system using at least one of the
estimated postures of the plurality of image sensing
devices; calculating a correction transform for
reducing shakiness of the common coordinate system
using the estimated posture of the common coordinate
system; calculating a correction transform for reducing
shakiness of each of the plurality of image sensing
devices using the correction transform; applying the
corresponding correction transform to a sensed image
which is sensed by each of the plurality of image
sensing devices; and composing a panoramic image by
joining a plurality of transformed sensed images.

According to another aspect of the present
invention, there is provided an image processing method
comprising: setting a common coordinate system which
can be transformed from individual coordinate systems
of a plurality of image sensing devices; estimating
postures of at least one of the plurality of image
sensing devices; calculating an estimated posture of
the common coordinate system using at least one of the
estimated postures of the plurality of image sensing
devices; calculating a correction transform for
reducing shakiness of the common coordinate system
using the estimated posture of the common coordinate
system; composing a panoramic image by joining a
plurality of sensed images, which are sensed by the
plurality of image sensing devices; and applying the
correction transform for reducing shakiness of the
common coordinate system to the panoramic image.

According to a further aspect of the present
invention, there is provided an image processing
apparatus comprising: a setting unit adapted to set a
common coordinate system which can be transformed from
individual coordinate systems of a plurality of image
sensing devices; an estimation unit adapted to estimate
postures of at least one of the plurality of image
sensing devices; a first calculation unit adapted to
calculate an estimated posture of the common coordinate
system using at least one of the estimated postures of
the plurality of image sensing devices; a second
calculation unit adapted to calculate a correction
transform for reducing shakiness of the common
coordinate system using the estimated posture of the
common coordinate system; a third calculation unit
adapted to calculate a correction transform for
reducing shakiness of each of the plurality of image
sensing devices using the correction transform; an
application unit adapted to apply the corresponding
correction transform to a sensed image which is sensed
by each of the plurality of image sensing devices; and
a composition unit adapted to compose a panoramic image
by joining a plurality of transformed sensed images.

According to a still further aspect of the present
invention, there is provided an image processing
apparatus comprising: a setting unit adapted to set a
common coordinate system which can be transformed from
individual coordinate systems of a plurality of image
sensing devices; an estimation unit adapted to estimate
postures of at least one of the plurality of image
sensing devices; a first calculation unit adapted to
calculate an estimated posture of the common coordinate
system using at least one of the estimated postures of
the plurality of image sensing devices; a second
calculation unit adapted to calculate a correction
transform for reducing shakiness of the common
coordinate system using the estimated posture of the
common coordinate system; a composition unit adapted to
compose a panoramic image by joining a plurality of
sensed images, which are sensed by the plurality of
image sensing devices; and an application unit adapted
to apply the correction transform for reducing
shakiness of the common coordinate system to the
panoramic image.

According to a yet further aspect of the present
invention, there is provided a computer program for
making a computer function as an image processing
apparatus of the present invention, or a computer
readable storage medium storing the computer program.

According to another aspect of the present
invention, there is provided an imaging apparatus
comprising: a plurality of image sensing devices; a
processor for composing a stabilized panoramic image;
and a display device for displaying the panoramic
image, wherein the processor composes the panoramic
image by performing the steps of: setting a common
coordinate system which can be transformed from
individual coordinate systems of the plurality of image
sensing devices; estimating postures of at least one of
the plurality of image sensing devices; calculating an
estimated posture of the common coordinate system using
at least one of the estimated postures of the plurality
of image sensing devices; calculating a correction
transform for reducing shakiness of the common
coordinate system using the estimated posture of the
common coordinate system; calculating a correction
transform for reducing shakiness of each of the
plurality of image sensing devices using the correction
transform; applying the corresponding correction
transform to a sensed image which is sensed by each of
the plurality of image sensing devices; and composing
the stabilized panoramic image by joining a plurality
of transformed sensed images.

According to a further aspect of the present
invention, there is provided an imaging apparatus
comprising: a plurality of image sensing devices; a
processor for composing a stabilized panoramic image;
and a display device for displaying the panoramic
image, wherein the processor composes the panoramic
image by performing the steps of: setting a common
coordinate system which can be transformed from
individual coordinate systems of the plurality of image
sensing devices; estimating postures of at least one of
the plurality of image sensing devices; calculating an
estimated posture of the common coordinate system using
at least one of the estimated postures of the plurality
of image sensing devices; calculating a correction
transform for reducing shakiness of the common
coordinate system using the estimated posture of the
common coordinate system; composing a panoramic image
by joining a plurality of sensed images, which are
sensed by the plurality of image sensing devices; and
applying the correction transform for reducing
shakiness of the common coordinate system to the
panoramic image in order to compose the stabilized
image.

Other features and advantages of the present
invention will be apparent from the following
description taken in conjunction with the accompanying
drawings, in which like reference characters designate
the same or similar parts throughout the figures
thereof.

BRIEF DESCRIPTION OF THE DRAWINGS 
The accompanying drawings, which are incorporated
in and constitute a part of the specification,
illustrate embodiments of the invention and, together
with the description, serve to explain the principles
of the invention.

Fig. 1 is a view for explaining transformation
for estimating the position and posture of the virtual
panoramic camera for each camera;

Fig. 2 is a view for explaining transformation
for estimating the position and posture of the virtual
panoramic camera using the estimated values of the
positions and postures of the virtual panoramic camera
calculated for each camera;

Fig. 3 is a view for explaining transformation
for reducing a shakiness of a camera;

Fig. 4 is a block diagram for explaining the
functional arrangement of a panoramic video generation
system according to the first embodiment;

Fig. 5 is a block diagram showing an example of
the arrangement of a video data collection system 110
used to collect videos to be saved in a sensed video
storage unit 10;

Fig. 6 is a block diagram showing the arrangement
of an image sensing unit 1101 in detail;

Fig. 7 is a block diagram showing an example of
the hardware arrangement of an image processing
apparatus 1;

Fig. 8 is a view showing an example of a radial
layout of cameras used to sense a panoramic video;

Fig. 9 is a flow chart for explaining a process
for reducing shakiness upon displaying a panoramic
video;

Fig. 10 is a flow chart for explaining a process
in an individual camera posture information calculation
unit 20;

Fig. 11 is a flow chart for explaining a process
in a virtual panoramic camera posture information
calculation unit 60;

Fig. 12 is a flow chart for explaining a process
in a panoramic video composition unit 80;

Fig. 13 is a flow chart for explaining a process
in a stabilize unit 100;

Fig. 14 is a block diagram for explaining the
functional arrangement of a panoramic video generation
system according to the third embodiment;

Fig. 15 is a flow chart for explaining a process
for reducing shakiness upon generation of a panoramic
video;

Fig. 16 is a flow chart for explaining a process
in a synchronized camera posture information
calculation unit 280; and

Fig. 17 is a flow chart for explaining a process
in a stabilized panoramic video composition unit 300.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 
Preferred embodiments of the present invention
will now be described in detail in accordance with the
accompanying drawings.

Note that an image processing method to be
explained in the following embodiments can be
implemented when an image processing apparatus serving
as a computer apparatus executes a computer program
that implements that image processing method.

[First Embodiment]

This embodiment will explain an image processing
method which reduces shakiness using posture
information of a virtual panoramic camera, which is
calculated based on posture information of a plurality
of cameras, upon displaying a panoramic video.

A panoramic video generation system according to
this embodiment will be described below. Fig. 4 is a
block diagram for explaining the functional arrangement
of the panoramic video generation system according to
this embodiment. This system includes a video
collection system 110 and image processing apparatus 1.
The image processing apparatus 1 comprises a sensed
video storage unit 10, individual camera posture
information calculation unit 20, console 30, display
unit 40, individual camera posture information storage
unit 50, virtual panoramic camera posture information
calculation unit 60, virtual panoramic camera posture
information storage unit 70, panoramic video
composition unit 80, panoramic video storage unit 90,
and stabilize unit 100.

The sensed video storage unit 10 stores a
photo-realistic video captured by the video collection
system 110 to be described later in a storage device
such as a hard disk drive or the like.

The individual camera posture information
calculation unit 20 calculates posture information of
each frame of the video stored in the sensed video
storage unit 10. Details of the process will be
explained later.

The console 30 is used by the user of this system
to input instructions and the like to the system, and
comprises input devices such as a mouse, keyboard, and
the like. The aforementioned individual camera posture
information calculation unit 20 sets camera parameters
and the like in accordance with operation inputs that
the user makes on the console 30 while observing a
display on the display unit 40.

The individual camera posture information storage
unit 50 stores the posture information of each camera
calculated by the individual camera posture information
calculation unit 20.

The virtual panoramic camera posture information
calculation unit 60 calculates posture information of
the virtual panoramic camera by combining the posture
information of the respective cameras stored in the
individual camera posture information storage unit 50.
Details of the process will be explained later.

The virtual panoramic camera posture information
storage unit 70 stores the posture information of the
virtual panoramic camera calculated by the virtual
panoramic camera posture information calculation unit
60.

The panoramic video composition unit 80
composes a panoramic video by stitching video frames
of the same time instant stored in the sensed video
storage unit 10. Details of the process will be
explained later.

The panoramic video storage unit 90 stores the
panoramic video composed by the panoramic video
composition unit 80.

The stabilize unit 100 reduces shakiness of the
panoramic video stored in the panoramic video storage
unit 90 using the posture information of the virtual
panoramic camera stored in the virtual panoramic camera
posture information storage unit 70, and displays that
video on the display unit 40.

Fig. 5 is a block diagram showing an example of
the arrangement of the video collection system 110 used
to collect video data to be saved in the sensed video
storage unit 10. As shown in Fig. 5, the video
collection system 110 comprises three units, i.e., an
image sensing unit 1101, recording unit 1102, and
capture unit 1103. The image sensing unit 1101 is used
to sense a surrounding scene while moving. The
recording unit 1102 is used to record a video output
from the image sensing unit 1101. The capture unit
1103 is used to store the collected video data in the
sensed video storage unit 10 in the image processing
apparatus 1.

Note that the image sensing unit 1101 comprises N
(N ≥ 2) cameras 1101-1 to 1101-N and a sync signal
generator 1104, as shown in Fig. 6. The cameras 1101-1
to 1101-N can receive an external sync signal from the
sync signal generator 1104. In this embodiment, the
shutter timings of the cameras 1101-1 to 1101-N are
matched using the external sync signal output from the
sync signal generator 1104. The cameras 1101-1 to
1101-N can be laid out in a radial pattern, as shown in,
by way of example, Fig. 8. Also, the viewpoint
positions of the cameras 1101-1 to 1101-N may be
matched by reflecting their fields of view by a
polygonal mirror. In either case, the cameras 1101-1
to 1101-N are firmly fixed in position, so as not to
change their relative positions and postures.

The image processing apparatus 1 will be
described below. Fig. 7 is a block diagram showing an
example of the hardware arrangement of the image
processing apparatus 1 according to this embodiment.
The hardware arrangement shown in Fig. 7 is equivalent
to that of a commercially available, normal personal
computer. Referring to Fig. 7, a disk 405, which is
represented by, but not limited to, a hard disk, forms
the sensed video storage unit 10 and stores video data
obtained by the video collection system 110 explained
using Figs. 5 and 6. Note that the disk 405 forms not
only the aforementioned sensed video storage unit 10,
but also the individual camera posture information
storage unit 50, virtual panoramic camera posture
information storage unit 70, and panoramic video
storage unit 90 shown in Fig. 4.

A CPU 401 serves as the individual camera posture
information calculation unit 20, virtual panoramic
camera posture information calculation unit 60, and
panoramic video composition unit 80 when it executes a
program saved in the disk 405, a ROM 406, or an
external storage device (not shown).

When the CPU 401 issues various display
instructions to a display controller (CRTC) 402, the
display controller 402 and a frame buffer 403 make a
desired display on a display (CRT) 404. Note that Fig.
7 shows the CRTC as the display controller 402, and the
CRT as the display 404. The display, however, is not
limited to the CRT, and a liquid crystal display or the
like may be used. Note that the CRTC 402, frame buffer
403, and CRT 404 form the display unit 40 shown in Fig.
4. A mouse 408, keyboard 409, and joystick 410 allow
the user to make operation inputs to the image
processing apparatus 1, and form the console 30 shown
in Fig. 4.

Details of the process in the image processing
apparatus 1 will be described below. Fig. 8 shows an
example of the radial layout of the N cameras so as to
explain this method.

The relative postures of the N cameras 1101-1 to
1101-N are obtained in advance by, e.g., camera
calibration or the like. Then, a virtual panoramic
camera coordinate system 510 which has as an origin the
barycentric position of the lens centers of the
respective cameras is defined to calculate transforms
between coordinate systems 1111-1 to 1111-N which have
as their origins the lens centers of the respective
cameras, and the virtual panoramic camera coordinate
system 510. The transform between the coordinate
systems can be expressed by, e.g., a 4 x 4 matrix.
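As a minimal sketch of the definition above, the following code places the virtual panoramic camera origin at the barycenter of the lens centers and builds a 4 x 4 transform from one camera's coordinate system to the virtual one. It assumes that calibration yields each camera's rotation and lens center in a shared rig frame, and that the virtual camera axes are aligned with that rig frame; both are assumptions of this illustration, and all function names are hypothetical.

```python
def barycenter(points):
    """Mean of a list of 3-D points (the lens centers)."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(3)]

def camera_to_virtual(rotation, lens_center, origin):
    """Build the 4x4 transform from camera coordinates to virtual
    panoramic camera coordinates.

    rotation: 3x3 camera-to-rig rotation (list of rows), assumed known
        from calibration.
    lens_center: camera origin expressed in the rig frame.
    origin: origin of the virtual camera in the rig frame.
    """
    t = [lens_center[i] - origin[i] for i in range(3)]
    rows = [list(rotation[i]) + [t[i]] for i in range(3)]
    return rows + [[0.0, 0.0, 0.0, 1.0]]

def apply_transform(h, point):
    """Apply a 4x4 transform to a 3-D point (column-vector convention)."""
    v = list(point) + [1.0]
    return [sum(h[i][k] * v[k] for k in range(4)) for i in range(3)]
```

With this construction, the lens center of each camera maps to its offset from the barycenter, which is what makes the virtual coordinate system common to all cameras.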

The flow of the process for reducing shakiness on 
the basis of a video synchronously sensed by the 
aforementioned cameras upon displaying a panoramic 
video will be explained below using the flow chart of 
that process shown in Fig. 9. 

In step S100, posture information of each of the 
cameras 1101-1 to 1101-N is individually calculated. 
Details of this process will be explained later using 
the flow chart of that process shown in Fig. 10. 

In step S120, the posture information of the 
virtual panoramic camera is calculated based on those 
of the respective cameras 1101-1 to 1101-N. Details of 
this process will be explained later using the flow 
chart of that process shown in Fig. 11. 

In step S140, a panoramic video is composed by
joining video frames obtained by the cameras 1101-1 to
1101-N. Details of this process will be explained
later using the flow chart of that process shown in Fig.
12.

Finally, in step S160, a transform for reducing
shakiness of the virtual panoramic camera is applied to
each frame of the panoramic video composed in step S140
on the basis of the posture information of the virtual
panoramic camera calculated in step S120, thus
displaying a stabilized (shakiness-reduced) panoramic
video on the display device. Details of this process
will be explained later using the flow chart of that
process shown in Fig. 13. With the above processes,
shakiness can be reduced upon displaying a panoramic
video.

(Calculation of Posture Information of Individual
Camera)

Details of the process in step S100 in Fig. 9
will be described below using the flow chart of that
process shown in Fig. 10. This is also a detailed
description of the process in the individual camera
posture information calculation unit 20.

In step S1001, variable n indicating the camera 
number is set to be an initial value "1". 

In step S1002, a video sensed by the camera
1101-n is acquired from the sensed video storage unit 10.
When the image sensing unit 1101 adopts an arrangement
for sensing a vertically elongated video by rolling the
camera 1101-n by approximately 90° about the optical
axis, the roll angle is adjusted by rolling the
acquired video by approximately 90°. On the other
hand, when the image sensing unit 1101 adopts an
arrangement for reflecting the field of view of the
camera 1101-n by a polygonal mirror, the influence of
the mirror is removed by, e.g., inverting the acquired
video.

In step S1003, known camera parameters (e.g., the 
image sensing element size, focal length, and the like) 
of the camera 1101-n are input. 

In step S1004, posture information of the camera
1101-n in each frame is calculated. More specifically,
a transform $H_{nm}$ from the coordinate system in each
frame m (m = 2 to M) to that in frame 1 is calculated
using match moving software or the like. Note that M
represents the total number of frames of the video
sensed by the camera 1101-n. Also, since the posture
information can be easily calculated from the transform
$H_{nm}$, the transform $H_{nm}$ represents the posture
information in this embodiment.

In step S1005, the posture information $H_{nm}$ of the
camera in each frame is stored in the individual camera
posture information storage unit 50.

In step S1006, n indicating the camera number is
incremented by "1". It is checked in step S1007 if the
processes for all the cameras are complete. If NO in
step S1007, the flow returns to step S1002.

With the above processes, the posture information
of each camera in each frame can be individually
calculated.

(Calculation of Posture Information of Virtual
Panoramic Camera)

Details of the process in step S120 will be
explained below using the flow chart of that process
shown in Fig. 11. This is also a detailed description
of the process in the virtual panoramic camera posture
information calculation unit 60.

In step S1201, the posture information $H_{nm}$ of
each camera in each frame is acquired from the
individual camera posture information storage unit 50.

In step S1202, m indicating the frame number is
set to be an initial value "1". In step S1203, n
indicating the camera number is set to be an initial
value "1".

In step S1204, the posture information of the
virtual panoramic camera is calculated for each camera.
That is, as shown in Fig. 1, a transform $H_{vm\_n}$ from the
coordinate system of the virtual panoramic camera in
frame m to that in frame 1 is calculated based on $H_{nm}$.
Assume that the transform $H_{vm\_n}$ represents the posture
information of the virtual panoramic camera.

More specifically, $H_{vm\_n}$ is given by:

$$H_{vm\_n} = H_n \cdot H_{nm} \cdot H_n^{-1}$$

where $H_n$ represents a transform from the coordinate
system of the camera 1101-n to that of the virtual
panoramic camera. When the respective cameras are
fixed in position, and their relative postures do not
change, $H_n$ does not change and can therefore be
calculated in advance.
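The conversion in step S1204 can be sketched in plain Python as follows. The sketch assumes a column-vector convention, under which a point in the virtual camera's frame-m coordinates is first mapped into camera n's coordinates (inverse of the camera-to-virtual transform), then from frame m to frame 1 of camera n, and finally back into virtual camera coordinates. It also assumes that all transforms are rigid (rotation plus translation). The function names are hypothetical.

```python
def matmul(a, b):
    """Product of two 4x4 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def rigid_inverse(h):
    """Invert a rigid 4x4 transform [R t; 0 1] as [R^T, -R^T t; 0 1]."""
    rt = [[h[j][i] for j in range(3)] for i in range(3)]
    t = [-sum(rt[i][k] * h[k][3] for k in range(3)) for i in range(3)]
    return [rt[i] + [t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def virtual_camera_transform(h_n, h_nm):
    """Per-camera estimate of the virtual camera's frame-m to frame-1
    transform: h_n * h_nm * inverse(h_n)."""
    return matmul(matmul(h_n, h_nm), rigid_inverse(h_n))
```

Because the camera-to-virtual transform is fixed for a rigidly mounted rig, its inverse can be computed once per camera and reused for every frame.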

In step S1205, posture information $H_{vm\_n}$ of the
virtual panoramic camera in each frame is stored in the
virtual panoramic camera posture information storage
unit 70.

In step S1206, n indicating the camera number is 
incremented by "1". It is checked in step S1207 if the 
processes for all camera images are complete. If NO in 
step S1207, the flow returns to step S1204. 

In step S1208, respective pieces of posture
information of the virtual panoramic camera are
combined. More specifically, a single piece of posture
information of the virtual panoramic camera is
calculated based on the pieces of posture information
of the virtual panoramic camera calculated in step
S1204, one for each camera. That is, as shown in
Fig. 2, a transform $H_{vm}$ from the coordinate system of
the virtual panoramic camera in frame m to that in
frame 1 of this camera is estimated based on $H_{vm\_n}$
calculated for the cameras 1101-1 to 1101-N.

More specifically, the transforms $H_{vm\_n}$ calculated
for cameras 1101-1 to 1101-N are transformed into
vectors $x_{vm\_n} = (\theta_{vm\_n}, \phi_{vm\_n}, \varphi_{vm\_n})$, and a vector
$x_{vm} = (\theta_{vm}, \phi_{vm}, \varphi_{vm})$ is calculated from these N
vectors. Then, that vector is transformed into a 4 x 4
rotation matrix $H_{vm}$. Note that $\theta_{vm}$ is the roll angle,
$\phi_{vm}$ is the pitch angle, and $\varphi_{vm}$ is the yaw angle.

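For example, on a left-hand coordinate system, if
respective elements of the transform $H_{vm\_n}$ are expressed
by:

$$
H_{vm\_n} =
\begin{pmatrix}
r_{11} & r_{12} & r_{13} & r_{14} \\
r_{21} & r_{22} & r_{23} & r_{24} \\
r_{31} & r_{32} & r_{33} & r_{34} \\
0 & 0 & 0 & 1
\end{pmatrix}
$$

formulas for calculating $\theta_{vm\_n}$, $\phi_{vm\_n}$, and $\varphi_{vm\_n}$ from the
transform $H_{vm\_n}$ are:

$$\theta_{vm\_n} = \tan^{-1}\left(\frac{r_{21}}{r_{22}}\right)$$

$$\phi_{vm\_n} = \tan^{-1}\left(\frac{-r_{23}}{\sqrt{r_{21}^{2} + r_{22}^{2}}}\right)$$

$$\varphi_{vm\_n} = \tan^{-1}\left(\frac{r_{13}}{r_{33}}\right)$$

Also, the 4 x 4 rotation matrix $H_{vm}$ is given by:

$$
H_{vm} =
\begin{pmatrix}
\cos\varphi_{vm} & 0 & \sin\varphi_{vm} & 0 \\
0 & 1 & 0 & 0 \\
-\sin\varphi_{vm} & 0 & \cos\varphi_{vm} & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & \cos\phi_{vm} & -\sin\phi_{vm} & 0 \\
0 & \sin\phi_{vm} & \cos\phi_{vm} & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
\cos\theta_{vm} & -\sin\theta_{vm} & 0 & 0 \\
\sin\theta_{vm} & \cos\theta_{vm} & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
$$

$$
=
\begin{pmatrix}
\cos\varphi_{vm}\cos\theta_{vm} + \sin\varphi_{vm}\sin\phi_{vm}\sin\theta_{vm} & -\cos\varphi_{vm}\sin\theta_{vm} + \sin\varphi_{vm}\sin\phi_{vm}\cos\theta_{vm} & \sin\varphi_{vm}\cos\phi_{vm} & 0 \\
\cos\phi_{vm}\sin\theta_{vm} & \cos\phi_{vm}\cos\theta_{vm} & -\sin\phi_{vm} & 0 \\
-\sin\varphi_{vm}\cos\theta_{vm} + \cos\varphi_{vm}\sin\phi_{vm}\sin\theta_{vm} & \sin\varphi_{vm}\sin\theta_{vm} + \cos\varphi_{vm}\sin\phi_{vm}\cos\theta_{vm} & \cos\varphi_{vm}\cos\phi_{vm} & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
$$

where $x_{vm}$ can be, e.g., the average of $x_{vm\_1}$ to $x_{vm\_N}$. In
this case, $x_{vm}$ is given by:

$$x_{vm} = \frac{1}{N}\sum_{n=1}^{N} x_{vm\_n}$$

On the other hand, in place of simply calculating
the average of the values $x_{vm\_1}$ to $x_{vm\_N}$, $x_{vm}$ may be
calculated in accordance with required reliability,
thus further reducing an error of $x_{vm}$.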
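The combination in step S1208 (angle extraction, simple averaging, and reconstruction of the rotation matrix) can be sketched as follows. The sketch assumes the yaw-pitch-roll composition described above, i.e., a rotation about the y axis (yaw), times a rotation about the x axis (pitch), times a rotation about the z axis (roll), and derives the extraction formulas from that composition. It uses `math.atan2` instead of a plain arctangent of a ratio so that the quadrant is resolved and division by zero is avoided. All function names are hypothetical.

```python
import math

def transform_from_angles(roll, pitch, yaw):
    """Compose R_y(yaw) * R_x(pitch) * R_z(roll) as a 4x4 list of rows."""
    ct, st = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cy * ct + sy * sp * st, -cy * st + sy * sp * ct, sy * cp, 0.0],
        [cp * st, cp * ct, -sp, 0.0],
        [-sy * ct + cy * sp * st, sy * st + cy * sp * ct, cy * cp, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def angles_from_transform(h):
    """Recover (roll, pitch, yaw) from a matrix of the form above."""
    roll = math.atan2(h[1][0], h[1][1])
    pitch = math.atan2(-h[1][2], math.hypot(h[1][0], h[1][1]))
    yaw = math.atan2(h[0][2], h[2][2])
    return (roll, pitch, yaw)

def average_angles(vectors):
    """Component-wise mean of (roll, pitch, yaw) vectors."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(3))

def combined_transform(per_camera_transforms):
    """Average the per-camera estimates in angle space, then rebuild
    the 4x4 rotation matrix of the virtual panoramic camera."""
    vectors = [angles_from_transform(h) for h in per_camera_transforms]
    return transform_from_angles(*average_angles(vectors))
```

Averaging angles component-wise is reasonable for the small frame-to-frame rotations typical of camera shake; it would need care if the angles approached the ±180° wrap-around or if the pitch approached ±90°.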

Some examples will be explained below. 

The values $x_{vm\_n}$ (n = 1 to N) can be obtained by
detecting feature points in an image, tracing these
points across a plurality of frames, and making
geometric calculations such as factorization or the
like based on a set of the traced feature points. In
this case, the reliability of the value $x_{vm\_n}$ depends on
how many frames on average the feature points can be
traced. That is, reliability $r_n$ of $x_{vm\_n}$ can be
expressed by:

$$r_n = f(m_n)$$

where $m_n$ is the average number of traceable frames, and
f is an arbitrary function.

In case of a simple proportionality relation, the 
reliability r n is given by: 

r n = m n 

At this time, assuming that an error is not allowed if 
the average number of traceable frames is less than a 
given threshold value (e.g., 10), when 
r n < 10 
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Xvm_n is inhibited from being used in calculations of 
thus reducing an error of x^. 

Alternatively, $x_{vm}$ can be estimated more
accurately when it is calculated as the weighted
average according to the reliabilities of $x_{vm\_1}$ to $x_{vm\_N}$.
In this case, $x_{vm}$ is given by:

$$x_{vm} = \frac{\sum_{n=1}^{N} r_n \, x_{vm\_n}}{\sum_{n=1}^{N} r_n}$$

The reliability $r_n$ of $x_{vm\_n}$ can be calculated in
advance in accordance with the direction of each camera
which forms the video collection system 110, in place
of being calculated based on the average number $m_n$ of
traceable frames. If the angle between the travel direction
of the video collection system 110 and the optical axis
direction of a camera is close to 90°, the moving amounts
of feature points on a frame become large, so the average
number of traceable frames tends to decrease. Hence, the
reliability $r_n$ can be set in advance to a smaller value
as the angle between the travel direction (or reverse travel
direction) and the optical axis direction of each camera
increases.

For example, let $v_s$ be the vector of the travel
direction, and $v_c$ be the optical axis direction of the
camera. Then, using their inner product, the
reliability $r_n$ can be expressed by:

$$r_n = |v_s \cdot v_c| + r_{n0}$$

where $r_{n0}$ is an arbitrary constant.

Note that the reliabilities corresponding to the
respective cameras can be set to be symmetrical about
the center of the video collection system 110. When an
automobile that mounts the video collection system 110
travels along the left-hand lane, since cameras on the
left side in the travel direction have smaller
distances to an object than those on the right side,
the average number of frames that can be used to trace
feature points becomes smaller. Hence, the
reliabilities of the cameras on the left side in the
travel direction may be set to be lower than those on
the right side.

Also, a calculation of the value $x_{vm\_n}$
corresponding to a camera which makes nearly a right
angle with the travel direction may be skipped. As a
result, an error of $x_{vm}$ can be reduced while shortening
the calculation time.
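
The direction-based reliability can be illustrated with a short Python
sketch. It is a sketch under assumptions rather than the method of the
embodiment: the function name is hypothetical, and both vectors are
normalized to unit length before the inner product is taken, which the
text does not state explicitly.

```python
import math


def direction_reliability(travel_dir, optical_axis, r_n0=0.0):
    """Return |v_s . v_c| + r_n0 for unit-normalized direction vectors."""
    def unit(vec):
        norm = math.sqrt(sum(c * c for c in vec))
        return [c / norm for c in vec]

    v_s, v_c = unit(travel_dir), unit(optical_axis)
    inner = sum(a * b for a, b in zip(v_s, v_c))
    return abs(inner) + r_n0


# Usage: a camera looking forward or backward gets the largest value,
# and a camera looking sideways gets only the constant r_n0.
forward = direction_reliability((0, 0, 1), (0, 0, 2))
sideways = direction_reliability((0, 0, 1), (1, 0, 0), r_n0=0.1)
```

Taking the absolute value treats the travel direction and the reverse
travel direction alike, as in the description above.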

Note that the process for calculating the
weighted average of $x_{vm\_1}$ to $x_{vm\_N}$ can calculate a simple
average or can inhibit some values from being used in
calculations depending on the method of setting the
reliability $r_n$ used as the weight. That is, if an
identical $r_n$ is set for all values of n, a simple average is
calculated; if $r_n$ is set to be zero for a given n, the
corresponding value $x_{vm\_n}$ is not used.
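
The reliability-weighted combination, including the threshold on the
average number of traceable frames, might look like the following in
Python. The function names and the default threshold of 10 (taken from
the example value in the text) are illustrative assumptions, and the
plain weighted sum again assumes that the angles do not wrap around.

```python
def frame_count_reliability(mean_tracked_frames, threshold=10):
    """r_n = m_n, forced to zero when m_n is below the threshold."""
    if mean_tracked_frames < threshold:
        return 0.0
    return float(mean_tracked_frames)


def weighted_posture(vectors, reliabilities):
    """Weighted average of posture vectors using r_n as the weights."""
    total = sum(reliabilities)
    if total <= 0:
        raise ValueError("at least one camera needs a positive reliability")
    dims = len(vectors[0])
    return tuple(
        sum(r * v[i] for v, r in zip(vectors, reliabilities)) / total
        for i in range(dims)
    )


# Usage: the second camera tracked too few frames and is ignored.
estimates = [(0.02, -0.01, 0.30), (0.50, 0.40, 0.90), (0.04, -0.03, 0.32)]
weights = [frame_count_reliability(m) for m in (20, 6, 20)]
x_vm = weighted_posture(estimates, weights)
```

Equal weights reproduce the simple average, and a zero weight removes
the corresponding camera from the result, matching the two special
cases noted above.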

The description will revert to the flow chart
shown in Fig. 11. In step S1209, m indicating the
frame number is incremented by "1". It is checked in
step S1210 if the processes for all frames are complete.
If NO in step S1210, the flow returns to step S1203.

With the above processes, posture information of
the virtual panoramic camera in each frame can be
obtained.

(Generation of Panoramic Video) 

Details of the panoramic video generation process
in step S140 in Fig. 9 will be described below using
the flow chart of that process shown in Fig. 12. This
is also a detailed description of the process in the
panoramic video composition unit 80. A panoramic video
is composed by sequentially executing the following
processes for a plurality of successive frames.

In step S1401, various parameters used in image
correction and panoramic image generation are read from
a parameter storage unit (not shown). In step S1402, n
indicating the camera number is set to be an initial
value "1".

In step S1403, a video frame sensed by the camera
1101-n is acquired from the sensed video storage unit
10. As described above, when the image sensing unit
1101 adopts an arrangement for sensing a vertically
elongated video by rolling the camera 1101-n by about 90°
about the optical axis, the roll angle is adjusted by
rolling the acquired video by about 90°. On the other
hand, when the image sensing unit 1101 adopts an
arrangement for reflecting the field of view of the
camera 1101-n by a polygonal mirror, the acquired video
is inverted.

In step S1404, the aspect ratio of the read image
is corrected. In step S1405, lens distortion is
corrected. In this embodiment, barrel distortion is
corrected. In step S1406, an image plane is rotated.
In step S1407, the image is projected from a plane to a
cylindrical surface in accordance with the field angle
read from the parameter storage unit (not shown), thus
composing a transformed image.
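
The projection from a plane to a cylindrical surface in step S1407 can
be sketched with a common pinhole-camera formulation. This is an
assumption-laden illustration, not the procedure of the embodiment: the
function names are hypothetical, the focal length in pixels is derived
from the horizontal field angle, the output keeps the input size, and
nearest-neighbour sampling is used for brevity.

```python
import math


def focal_length_px(width, field_angle_deg):
    """Focal length in pixels for a given horizontal field angle."""
    return (width / 2.0) / math.tan(math.radians(field_angle_deg) / 2.0)


def cylinder_to_plane(u, v, width, height, field_angle_deg):
    """Map pixel (u, v) on the unrolled cylinder back to the planar image."""
    f = focal_length_px(width, field_angle_deg)
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    theta = (u - cx) / f                    # angle around the cylinder axis
    x = f * math.tan(theta) + cx            # column on the image plane
    y = (v - cy) / math.cos(theta) + cy     # row on the image plane
    return x, y


def project_to_cylinder(image, field_angle_deg):
    """Resample a planar image (a list of rows) onto a cylinder."""
    height, width = len(image), len(image[0])
    out = [[None] * width for _ in range(height)]
    for v in range(height):
        for u in range(width):
            x, y = cylinder_to_plane(u, v, width, height, field_angle_deg)
            xi, yi = int(round(x)), int(round(y))
            if 0 <= xi < width and 0 <= yi < height:
                out[v][u] = image[yi][xi]   # pixels outside stay None
    return out


# Usage: a tiny 3 x 3 "image" projected with a 60 degree field angle.
warped = project_to_cylinder([[1, 2, 3], [4, 5, 6], [7, 8, 9]], 60.0)
```

The loop runs over output pixels and looks up the source pixel (inverse
mapping), so the transformed image has no holes inside the valid
region.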

In step S1408, n indicating the camera number is
incremented by "1". It is checked in step S1409 if the
processes for all the camera images are complete. If
NO in step S1409, the flow returns to step S1403.

Finally, in step S1410, N (equal to the number of
cameras) transformed images are joined using upper,
lower, right, and left shift amounts, a mixing ratio,
and the like read from the parameter storage unit (not
shown). In step S1411, the composed panoramic image is
stored in the panoramic video storage unit 90.

By applying the above processes sequentially to
consecutive images, a panoramic video can be composed
on the basis of various parameters.

(Display of a Stabilized Panoramic Video)

Details of the process in step S160 in Fig. 9
will be explained below using the flow chart of that
process shown in Fig. 13. This is also a detailed
description of the process in the stabilize unit 100.

In step S1601, the posture information of the
virtual panoramic camera in each frame is acquired from
the virtual panoramic camera posture information
storage unit 70.

In step S1602, m indicating the frame number is
set in accordance with the operation at the console 30.
In step S1603, frame m of the panoramic video is
acquired from the panoramic video storage unit 90.

In step S1604, a transform for reducing shakiness
of the virtual panoramic camera is applied to the
acquired panoramic image, and a stabilized panoramic
image is displayed on the display unit 40.

Note that a transform $H_{vm\_s}$ for reducing shakiness
of the virtual panoramic camera can be calculated from
$H_{vm}$ acquired in step S1601. For example, $H_{vm\_s}$ can be a
4 x 4 rotation matrix that expresses a posture vector
$x_{vm} = (-\theta_{vm}, -\phi_{vm}, -\varphi_{vm})$ which is formed based on the roll
angle $\theta_{vm}$, pitch angle $\phi_{vm}$, and yaw angle $\varphi_{vm}$ obtained from
$H_{vm}$. Upon application of such a transform, the posture of
the virtual panoramic camera in frame m can be set to
be substantially equal to that in frame 1. Note that
the yaw angle need not be corrected by setting $x_{vm} =
(-\theta_{vm}, -\phi_{vm}, 0)$. In this way, when the video collection
system 110 is moving while turning, that turn motion
can be prevented from being removed by the shakiness
reduction process.
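
A transform of this kind can be sketched by multiplying elementary
rotations built from the negated angles. The code below is a minimal
sketch under assumptions: the function names and the `correct_yaw` flag
are hypothetical, and the composition order (yaw, then pitch, then
roll) is simply reused from the definition of the rotation matrix given
earlier.

```python
import math


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def elementary(axis, angle):
    """4x4 rotation about 'z' (roll), 'x' (pitch) or 'y' (yaw)."""
    c, s = math.cos(angle), math.sin(angle)
    m = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    if axis == "z":
        m[0][0], m[0][1], m[1][0], m[1][1] = c, -s, s, c
    elif axis == "x":
        m[1][1], m[1][2], m[2][1], m[2][2] = c, -s, s, c
    elif axis == "y":
        m[0][0], m[0][2], m[2][0], m[2][2] = c, s, -s, c
    else:
        raise ValueError("axis must be 'x', 'y' or 'z'")
    return m


def stabilizing_transform(roll, pitch, yaw, correct_yaw=True):
    """Matrix built from the negated posture vector (-roll, -pitch, -yaw)."""
    yaw_term = -yaw if correct_yaw else 0.0  # leave turning motion untouched
    return matmul(matmul(elementary("y", yaw_term), elementary("x", -pitch)),
                  elementary("z", -roll))


# Usage: cancel roll and pitch but keep the vehicle's turn (yaw).
h_vm_s = stabilizing_transform(0.03, -0.02, 0.8, correct_yaw=False)
```

When more than one angle is non-zero, a matrix built this way is not in
general the exact inverse of the posture matrix, which is consistent
with the wording "substantially equal" above; for the small angles
typical of camera shake the difference is small.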
It is finally checked in step S1605 if the
display process is to end. If NO in step S1605, the
processes are repeated from step S1602.

With the above processes, a stabilized panoramic
video can be displayed.

As described above, according to the first
embodiment, shakiness can be reduced using the posture
information of the virtual panoramic camera, which is
calculated based on those of a plurality of cameras,
upon displaying a panoramic video.

[Second Embodiment]

In the first embodiment, the posture information
is formed of three kinds of azimuth information: the
roll angle, pitch angle, and yaw angle. This
embodiment will explain a case wherein the posture
information includes three-dimensional position
information in addition to these three kinds of azimuth
information.

The flow of the process in the virtual panoramic
camera posture information calculation unit 60 in this
embodiment is substantially the same as that in the
flow chart of the first embodiment shown in Fig. 11,
except for a combining method of posture information of
the virtual panoramic camera in step S1208.

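In this embodiment, the transforms $H_{vm\_n}$
calculated for the cameras 1101-1 to 1101-N are
transformed into vectors $x_{vm\_n} = (X_{vm\_n}, Y_{vm\_n}, Z_{vm\_n}, \theta_{vm\_n},
\phi_{vm\_n}, \varphi_{vm\_n})$, and a vector $x_{vm} = (X_{vm}, Y_{vm}, Z_{vm}, \theta_{vm}, \phi_{vm},
\varphi_{vm})$ is calculated from these N vectors. Then, that
vector is transformed into a 4 x 4 matrix $H_{vm}$.

For example, on a left-hand coordinate system, if
respective elements of the transform $H_{vm\_n}$ are expressed
by:

$$
H_{vm\_n} =
\begin{pmatrix}
r_{11} & r_{12} & r_{13} & r_{14} \\
r_{21} & r_{22} & r_{23} & r_{24} \\
r_{31} & r_{32} & r_{33} & r_{34} \\
r_{41} & r_{42} & r_{43} & r_{44}
\end{pmatrix}
$$

formulas for calculating $X_{vm\_n}$, $Y_{vm\_n}$, $Z_{vm\_n}$, $\theta_{vm\_n}$, $\phi_{vm\_n}$, and
$\varphi_{vm\_n}$ from the transform $H_{vm\_n}$ are:

$$X_{vm\_n} = r_{14}$$

$$Y_{vm\_n} = r_{24}$$

$$Z_{vm\_n} = r_{34}$$

$$\theta_{vm\_n} = \tan^{-1}\left(\frac{r_{21}}{r_{22}}\right)$$

$$\phi_{vm\_n} = \tan^{-1}\left(\frac{-r_{23}}{\sqrt{r_{21}^2 + r_{22}^2}}\right)$$

$$\varphi_{vm\_n} = \tan^{-1}\left(\frac{r_{13}}{r_{33}}\right)$$

Also, the matrix $H_{vm}$ is given by:

$$
H_{vm} =
\begin{pmatrix}
\cos\varphi_{vm}\cos\theta_{vm} + \sin\varphi_{vm}\sin\phi_{vm}\sin\theta_{vm} & -\cos\varphi_{vm}\sin\theta_{vm} + \sin\varphi_{vm}\sin\phi_{vm}\cos\theta_{vm} & \sin\varphi_{vm}\cos\phi_{vm} & X_{vm} \\
\cos\phi_{vm}\sin\theta_{vm} & \cos\phi_{vm}\cos\theta_{vm} & -\sin\phi_{vm} & Y_{vm} \\
-\sin\varphi_{vm}\cos\theta_{vm} + \cos\varphi_{vm}\sin\phi_{vm}\sin\theta_{vm} & \sin\varphi_{vm}\sin\theta_{vm} + \cos\varphi_{vm}\sin\phi_{vm}\cos\theta_{vm} & \cos\varphi_{vm}\cos\phi_{vm} & Z_{vm} \\
0 & 0 & 0 & 1
\end{pmatrix}
$$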

Note that the method of calculating one vector
$x_{vm}$ from the N vectors is the same as that explained in
the first embodiment.

Also, the flow of the process in the stabilize
unit 100 in this embodiment is substantially the same
as that in the flow chart in the first embodiment shown
in Fig. 13, except for the calculation method of the
transform for reducing shakiness of the virtual
panoramic camera in step S1604.

In this embodiment, $H_{vm\_s}$ can be a 4 x 4 matrix
that expresses a posture vector $x_{vm} =
(-X_{vm}, -Y_{vm}, -Z_{vm}, -\theta_{vm}, -\phi_{vm}, -\varphi_{vm})$ which is formed based
on the coordinates $X_{vm}$, $Y_{vm}$, and $Z_{vm}$, the roll angle $\theta_{vm}$,
pitch angle $\phi_{vm}$, and yaw angle $\varphi_{vm}$ obtained from $H_{vm}$.
Note that the yaw angle need not be corrected by setting
$x_{vm} = (-X_{vm}, -Y_{vm}, -Z_{vm}, -\theta_{vm}, -\phi_{vm}, 0)$.

As described above, according to the second
embodiment, even when three-dimensional position
information is included in the posture information,
shakiness can be reduced upon displaying a panoramic
video.

[Third Embodiment] 

This embodiment will explain an image processing
method, which calculates the posture information of a
virtual panoramic camera on the basis of those of a
plurality of cameras, and reduces shakiness using the
re-calculation results of the posture information of
the plurality of cameras based on that of the virtual
panoramic camera upon generation of a panoramic video.

A panoramic video generation system according to
this embodiment will be explained first. Fig. 14 is a
block diagram for explaining the functional arrangement
of the panoramic video generation system according to
this embodiment. This system includes a video
collection system 110 and image processing apparatus 2.
The image processing apparatus 2 comprises a sensed
video storage unit 210, individual camera posture
information calculation unit 220, console 230, display
unit 240, individual camera posture information storage
unit 250, virtual panoramic camera posture information
calculation unit 260, virtual panoramic camera posture
information storage unit 270, synchronized camera
posture information calculation unit 280, synchronized
camera posture information storage unit 290, stabilized
panoramic video composition unit 300, and stabilized
panoramic video storage unit 310.

Note that the arrangements of the sensed video
storage unit 210, individual camera posture information
calculation unit 220, console 230, display unit 240,
individual camera posture information storage unit 250,
virtual panoramic camera posture information
calculation unit 260, and virtual panoramic camera
posture information storage unit 270 are the same as
those of the sensed video storage unit 10, individual
camera posture information calculation unit 20, console
30, display unit 40, individual camera posture
information storage unit 50, virtual panoramic camera
posture information calculation unit 60, and virtual
panoramic camera posture information storage unit 70 in
the first embodiment.

The synchronized camera posture information
calculation unit 280 re-calculates the posture
information of each camera on the basis of the
calculation result of the virtual panoramic camera
posture information calculation unit 260. Details of
the process will be explained later.

The synchronized camera posture information
storage unit 290 stores the posture information of each
camera calculated by the aforementioned synchronized
camera posture information calculation unit 280.

The stabilized panoramic video composition unit
300 executes a stabilized panoramic video generation
process by joining video data saved in the sensed video
storage unit 210 after a shakiness reduction process.
Details of the process will be explained later.

The stabilized panoramic video storage unit 310
stores a stabilized panoramic video composed by the
stabilized panoramic video composition unit 300.

The image processing apparatus 2 will be
explained below. The image processing apparatus 2
according to this embodiment can be implemented by the
same hardware arrangement as that of the image
processing apparatus 1 according to the first
embodiment shown in Fig. 7. Note that some individual
arrangements are different.

The disk 405 forms not only the sensed video
storage unit 210, but also the individual camera
posture information storage unit 250, virtual panoramic
camera posture information storage unit 270, synchronized
camera posture information storage unit 290, and
stabilized panoramic video storage unit 310 shown in
Fig. 14.

The CPU 401 serves as the individual camera
posture information calculation unit 220, virtual
panoramic camera posture information calculation unit
260, synchronized camera posture information
calculation unit 280, and stabilized panoramic video
composition unit 300.

The CRTC 402, frame buffer 403, and CRT 404 form
the aforementioned display unit 240. A mouse 408,
keyboard 409, and joystick 410 allow the user to make
operation inputs to the image processing apparatus 2,
and form the aforementioned console 230.

Details of the process in the image processing
apparatus 2 will be described below. As in the first
embodiment, a case will be exemplified wherein N
cameras are laid out in a radial pattern, as shown in
Fig. 8.

The flow of the process for reducing shakiness of
a video synchronously sensed by the aforementioned
cameras and then composing a panoramic video will be
explained below using the flow chart of that process
shown in Fig. 15.

In step S200, posture information of each camera
is individually calculated. The contents of this
process are the same as those of the process that has
been explained using the flow chart of Fig. 10 in the
first embodiment.

In step S220, the posture information of the
virtual panoramic camera is calculated based on those
of the respective cameras. The contents of this
process are the same as those of the process that has
been explained using the flow chart of Fig. 11 in the
first embodiment.

In step S240, the posture information of each
camera is re-calculated on the basis of that of the
virtual panoramic camera. Details of the process will
be explained later using the flow chart of that process
shown in Fig. 16.

In step S260, a panoramic video is composed by
joining video frames sensed by the cameras 1101-1 to
1101-N, after reducing shakiness of those video frames.
Details of the process will be explained later using
the flow chart of that process shown in Fig. 17. With
the above processes, shakiness can be reduced upon
generation of a panoramic video.

(Synchronized Camera Posture Information Calculation
Process)

Details of an example of the synchronized camera
posture information calculation process in step S240 in
Fig. 15 will be explained below using the flow chart of
that process shown in Fig. 16. This is also a detailed
description of the process in the synchronized camera
posture information calculation unit 280.

In step S2401, posture information $H_{vm}$ of the
virtual panoramic camera in each frame is acquired from
the virtual panoramic camera posture information
storage unit 270.

In step S2402, m indicating the frame number is
set to be an initial value "1". In step S2403, a
transform $H_{vm\_s}$ for reducing shakiness of the virtual
panoramic camera is calculated from $H_{vm}$.

For example, $H_{vm\_s}$ can be a 4 x 4 rotation matrix
that expresses a posture vector $x_{vm} = (-\theta_{vm}, -\phi_{vm}, -\varphi_{vm})$
which is formed based on the roll angle $\theta_{vm}$, pitch
angle $\phi_{vm}$, and yaw angle $\varphi_{vm}$ obtained from $H_{vm}$. Upon
application of such a transform, the posture of the
virtual panoramic camera in frame m can be set to be
substantially equal to that in frame 1. Note that the
yaw angle need not be corrected by setting $x_{vm} =
(-\theta_{vm}, -\phi_{vm}, 0)$. In this way, when the video collection
system 110 is moving while turning, that turn motion
can be prevented from being removed by the shakiness
reduction process.

In step S2404, n indicating the camera number is
set to be an initial value "1". In step S2405, a
transform $H_{nm\_s}$ for reducing shakiness of the camera
1101-n is calculated (see Fig. 3). More specifically,
$H_{nm\_s}$ can be calculated by:

Assume that $H_{nm\_s}$ expresses the posture information of
the camera 1101-n in this embodiment, since the posture
information of the camera 1101-n can be calculated
based on $H_{nm\_s}$.

In step S2406, the calculated posture information
$H_{nm\_s}$ of the camera is stored in the synchronized camera
posture information storage unit 290.

In step S2407, n indicating the camera number is
incremented by "1". It is checked in step S2408 if the
processes for all the cameras are complete. If NO in
step S2408, the flow returns to step S2405.

In step S2409, m indicating the frame number is
incremented by "1". It is checked in step S2410 if the
processes for all the frames are complete. If NO in
step S2410, the flow returns to step S2403. With the
above processes, the posture information of each camera
in each frame can be calculated.

(Generation of a Stabilized Panoramic Video)

Details of an example of the process in step S260
in Fig. 15 will be described below using the flow chart
of that process shown in Fig. 17. This is also a
detailed description of the process in the stabilized
panoramic video composition unit 300. In this
embodiment, a stabilized panoramic video is composed by
sequentially executing the following processes for a
plurality of successive frames.

In step S2601, various parameters used in image
correction and panoramic image generation are read from
a parameter storage unit (not shown). In step S2602,
posture information of each camera in each frame is
acquired from the synchronized camera posture
information storage unit 290. In step S2603, n
indicating the camera number is set to be an initial
value "1".

In step S2604, a video frame sensed by the camera
1101-n is acquired from the sensed video storage unit
210. As described above, when the image sensing unit
1101 adopts an arrangement for sensing a vertically
elongated video by rolling the camera 1101-n by about 90°
about the optical axis, the roll angle is adjusted by
rolling the acquired video by about 90°. On the other
hand, when the image sensing unit 1101 adopts an
arrangement for reflecting the field of view of the
camera 1101-n by a polygonal mirror, the acquired video
is inverted.

In step S2605, the aspect ratio of the read image
is corrected. In step S2606, lens distortion is
corrected. In this embodiment, barrel distortion is
corrected.

In step S2607, the transform $H_{nm\_s}$ for reducing
shakiness of the camera 1101-n is applied to the image
to reduce shakiness.

In step S2608, an image plane is rotated. In
step S2609, the image is projected from a plane to a
cylindrical surface in accordance with the field angle
read from the parameter storage unit (not shown), thus
composing a transformed image.

In step S2610, n indicating the camera number is
incremented by "1". It is checked in step S2611 if the
processes for all the camera images are complete. If
NO in step S2611, the flow returns to step S2604.

Finally, in step S2612, N (equal to the number of
cameras) transformed images are joined using upper,
lower, right, and left shift amounts, a mixing ratio,
and the like read from the parameter storage unit (not
shown). In step S2613, the composed, stabilized
panoramic image is stored in the stabilized panoramic
video storage unit 310.

By applying the above processes sequentially to
consecutive images, a stabilized panoramic video can be
composed.

As described above, according to the third
embodiment, the posture information of the virtual
panoramic camera is calculated based on those of a
plurality of cameras, and shakiness can be reduced
using the re-calculation results of the posture
information of the plurality of cameras based on that
of the virtual panoramic camera upon generation of a
panoramic video.

[Fourth Embodiment]

In the third embodiment, the posture information
is formed of three kinds of azimuth information: the
roll angle, pitch angle, and yaw angle. This
embodiment will explain a case wherein the posture
information includes three-dimensional position
information in addition to these three kinds of azimuth
information.

The flow of the process in the virtual panoramic
camera posture information calculation unit 260 in this
embodiment is substantially the same as that in the
flow chart of the first embodiment shown in Fig. 11.
The combining method of posture information of the
virtual panoramic camera in step S1208 is different,
however, in that it is the same method as is explained
in the second embodiment.

The flow of the process in the synchronized
camera posture information calculation unit 280 in this
embodiment is the same as that of the flow chart in the
third embodiment shown in Fig. 16. The calculation
method of the transform $H_{vm\_s}$ for reducing shakiness of
the virtual panoramic camera is different, however, in
that it is the same method that is described in the
second embodiment.

As described above, according to the fourth
embodiment, even when three-dimensional position
information is included in the posture information,
shakiness can be reduced upon generation of a panoramic
video.

[Fifth Embodiment]

In the first to fourth embodiments, shakiness is
reduced using the posture information of the camera
(virtual panoramic camera or each camera). This
embodiment will explain a case wherein a transform is
applied to the posture information of the camera, and
shakiness is then reduced.

For example, the posture information of the
camera (virtual panoramic camera or each camera) may be
transformed so that a desired posture can be set in the
first and last frames of a video. In this case, the
posture information of a frame for which no posture is
designated may be transformed so that the transformation
amount changes linearly. For example, let $\theta_1$ be the
roll angle of the camera in the first frame, $\theta_M$ be the
roll angle in the last frame, $\theta_1'$ be the desired roll
angle in the first frame, and $\theta_M'$ be the desired roll
angle in the last frame. Then, a transformed roll
angle $\theta_m'$ in frame m (m = 1 to M) can be expressed by:

$$\theta_m' = \theta_m + \frac{(M-m)(\theta_1' - \theta_1) + (m-1)(\theta_M' - \theta_M)}{M-1}$$

where $\theta_m$ is the roll angle before transformation.

Similar calculation formulas can be used for
elements other than the roll angle. Note that the
posture information of the camera may be transformed to
obtain desired postures in some middle frames. For
example, when a scene change to another panoramic video
is included, a change in posture upon switching the
video can be reduced.
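
The linear redistribution of the transformation amount between the
first and last frames can be sketched as follows. The function name and
the list-based interface are illustrative assumptions; only the roll
angle is handled, and other elements would be treated in the same way.

```python
def retarget_angles(angles, first_target, last_target):
    """Shift per-frame angles so the first and last frames hit the targets.

    The offset added to frame m changes linearly from
    (first_target - angles[0]) to (last_target - angles[-1]).
    """
    count = len(angles)
    if count < 2:
        raise ValueError("need at least two frames")
    d_first = first_target - angles[0]
    d_last = last_target - angles[-1]
    result = []
    for m, angle in enumerate(angles, start=1):
        offset = ((count - m) * d_first + (m - 1) * d_last) / (count - 1)
        result.append(angle + offset)
    return result


# Usage: three frames whose roll angles should start at 0 and end at 5.
adjusted = retarget_angles([1.0, 2.0, 4.0], 0.0, 5.0)
```

For this input the offsets are -1, 0, and +1 degrees, so the first and
last frames take the desired values while the middle frame is
unchanged.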

An upper limit may be set for the posture
transformation amount ($\theta_m' - \theta_m$) upon shakiness
reduction. As a result, when the posture changes
largely, a non-display region of a video can be
prevented from becoming too large. For example, assume
that the roll angle $\theta_m$ in frame m is transformed into
the angle $\theta_m'$ if the transformation amount has no upper
limit. In such a case, if the upper limit of the
transformation amount of the roll angle is set at 5°, a
transform value $\theta_m''$ with the upper limit can be
expressed by:

$$
\theta_m'' =
\begin{cases}
\theta_m - 5 & (\text{if } \theta_m' - \theta_m < -5) \\
\theta_m' & (\text{if } |\theta_m' - \theta_m| \le 5) \\
\theta_m + 5 & (\text{if } \theta_m' - \theta_m > 5)
\end{cases}
$$

Alternatively, when the maximum value of the
transformation amount is 12°, a transform value $\theta_m''$
with the upper limit may be calculated by:

Of course, elements other than the roll angle can be
obtained by similar calculation formulas.
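
The first of these limits, which clamps the transformation amount to
a fixed bound, can be written as a small Python function. The function
name is hypothetical and the default bound of 5 (degrees) is the
example value from the text.

```python
def limit_correction(original, desired, max_change=5.0):
    """Clamp the change (desired - original) to +/- max_change."""
    change = desired - original
    if change < -max_change:
        return original - max_change
    if change > max_change:
        return original + max_change
    return desired


# Usage: a requested 10 degree correction is reduced to 5 degrees.
limited = limit_correction(10.0, 20.0)
```

A correction that lies within the limit is passed through unchanged, so
small shake is still removed completely.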

To achieve the same object as above, only a
high-frequency component of shakiness may be reduced in
place of reducing all shakiness components. For
example, the high-frequency component can be calculated
by dividing the calculated transformation amount by the
weighted average of the transformation amounts of
several frames before and after the frame of interest.

As described above, according to the fifth
embodiment, since the posture information of each
camera is transformed and shakiness is then reduced,
the display quality of a panoramic video can be
improved.

[Other Embodiments]

The image processing method explained in each of
the above embodiments may be implemented by either a
processing apparatus which comprises a single device,
or a system which comprises a plurality of devices.

Note that the present invention can be applied to
an apparatus comprising a single device or to a system
constituted by a plurality of devices.

Furthermore, the invention can be implemented by
supplying a software program, which implements the
functions of the foregoing embodiments, directly or
indirectly to a system or apparatus, reading the
supplied program code with a computer of the system or
apparatus, and then executing the program code. In
this case, so long as the system or apparatus has the
functions of the program, the mode of implementation
need not rely upon a program.

Accordingly, since the functions of the present
invention are implemented by a computer, the program code
installed in the computer also implements the present
invention. In other words, the claims of the present
invention also cover a computer program for the purpose
of implementing the functions of the present invention.

In this case, so long as the system or apparatus
has the functions of the program, the program may be
executed in any form, such as an object code, a program
executed by an interpreter, or script data supplied to
an operating system.

Examples of storage media that can be used for
supplying the program are a floppy disk, a hard disk,
an optical disk, a magneto-optical disk, a CD-ROM, a
CD-R, a CD-RW, a magnetic tape, a non-volatile type
memory card, a ROM, and a DVD (a DVD-ROM and a DVD-R).

As for the method of supplying the program, a
client computer can be connected to a website on the
Internet using a browser of the client computer, and
the computer program of the present invention or an
automatically-installable compressed file of the
program can be downloaded to a recording medium such as
a hard disk. Further, the program of the present
invention can be supplied by dividing the program code
constituting the program into a plurality of files and
downloading the files from different websites. In
other words, a WWW (World Wide Web) server that
downloads, to multiple users, the program files that
implement the functions of the present invention by
a computer is also covered by the claims of the present
invention.

It is also possible to encrypt and store the
program of the present invention on a storage medium
such as a CD-ROM, distribute the storage medium to
users, allow users who meet certain requirements to
download decryption key information from a website via
the Internet, and allow these users to decrypt the
encrypted program by using the key information, whereby
the program is installed in the user computer.

Besides the cases where the aforementioned
functions according to the embodiments are implemented
by executing the read program by a computer, an operating
system or the like running on the computer may perform
all or a part of the actual processing so that the
functions of the foregoing embodiments can be
implemented by this processing.

Furthermore, after the program read from the
storage medium is written to a function expansion board
inserted into the computer or to a memory provided in a
function expansion unit connected to the computer, a
CPU or the like mounted on the function expansion board
or function expansion unit performs all or a part of
the actual processing so that the functions of the
foregoing embodiments can be implemented by this
processing.

As many apparently widely different embodiments
of the present invention can be made without departing
from the spirit and scope thereof, it is to be
understood that the invention is not limited to the
specific embodiments thereof except as defined in the
appended claims.